One of the great challenges in refining macromolecular crystal
structures is a low data-to-parameter ratio. Historically, knowledge
from chemistry has been used to help improve this ratio. When a
macromolecule crystallizes with more than one copy in the asymmetric
unit, the noncrystallographic symmetry (NCS) relationships can be
exploited to provide additional restraints when refining the working
model. However, although globally similar, NCS-related chains often
have local differences. To allow for local differences between
NCS-related molecules, flexible torsion-based NCS restraints have been
introduced, coupled with intelligent rotamer handling for protein
chains, and are available in phenix.refine for refinement of models at
all resolutions.

1. Introduction 

One of the great challenges in macromolecular crystallography is
improving the data-to-parameter ratio. When crystals contain multiple
copies of the same molecule or complex within the asymmetric unit, it
is reasonable to assume that these related entities will generally
adopt similar, if not identical, conformations. These
noncrystallographic symmetry (NCS) relationships have been used
previously to address the phasing problem (Rossmann, 1972; Bricogne,
1976). For refinement, restraints that maintain similarity between
related atomic positions can be added to the geometry target function,
introducing correlations between refined parameters. Alternatively,
constraint-based approaches improve the data-to-parameter ratio by
requiring NCS-related regions to be identical, such as the methods
described in Hendrickson (1985) and Kleywegt (1996). However, the use
of constraints often inappropriately enforces structural identity where
there are local structural differences, which is particularly
observable at high resolution (~1.8 Å or better). To address this
issue, NCS restraints for structure refinement have previously been
implemented in a variety of crystallographic refinement programs,
including PROLSQ (Hendrickson, 1985), TNT (Tronrud et al., 1987), FROG
(Urzhumtsev et al., 1989), CNS (Brünger et al., 1998), SHELX
(Sheldrick, 2008), BUSTER (Bricogne et al., 2010), REFMAC (Murshudov
et al., 2011) and phenix.refine (Afonine et al., 2012).

Many past implementations of NCS restraints enforced global similarity
between groups, effectively treating the symmetry relationships as
rigid. Even at moderate resolution, however, differences between
NCS-related copies are often supported by crystallographic data, and
must be taken into account to maximize the effectiveness of such
restraints without overfitting of the data by the model (Usón et al.,
1999). One may simply remove NCS restraints for groups with clear
conformational differences, but this approach typically requires
careful manual inspection of the model and maps, and is not easily
automated. Moreover, the inflexibility of global restraints during
refinement prevents convergence when starting from identical copies,
for instance after molecular replacement. Alternative approaches have
been implemented for NCS restraints, which instead restrain local
conformation (Usón et al., 1999; Sheldrick, 2008). More recently, local
NCS implementations in both REFMAC (Murshudov et al., 2011) and BUSTER
(Smart et al., 2008, 2012) use local similarity restraints based on
distances between atoms nearby in space. In each implementation, these
restraints resemble a simple harmonic near the target values, tapering
off to no restraint as the distance increases, similar to the
reference-model restraint implementations in BUSTER (Smart et al.,
2012), REFMAC5 (Nicholls et al., 2012) and phenix.refine (Headd et al.,
2012).

Here, we discuss an implementation of local similarity-based NCS
restraints that use torsion angles rather than local atomic distances.
Torsion angles are chosen for their well understood relationship to
macromolecular folding, for example correlated φ/ψ conformations
(Ramachandran & Sasisekharan, 1968), amino-acid side-chain rotamers
(Lovell et al., 2000) and RNA backbone conformations (Richardson et
al., 2008), which allow a limited number of restraints to govern the
coordinated movement of related structural elements.

2. Methods 

Torsion NCS restraints in phenix.refine are implemented using the same
torsion-based 'top-out' potential as described in Headd et al. (2012).
Briefly, a torsion restraint is added for each NCS-related torsion
angle in the working model. In proteins, this set of angles includes
all protein side-chain χ angles and the backbone φ, ψ and ω angles.
Improper dihedral C-N-Cα-Cβ and N-C-Cα-Cβ restraints are also added for
each protein residue to preserve Cβ geometry, with each torsion
restrained to the ideal value for the given residue type (Lovell et
al., 2003). For nucleic acids, this set of torsion angles includes all
seven backbone torsions, as well as any defined base χ angles. Only
macromolecules (protein and/or nucleic acid) are handled in the current
implementation. Non-standard amino acids and nucleic acids are also
supported automatically. It should also be noted that explicit torsion
restraints are an improvement over the 1,4-distance-based approach
described in Usón et al. (1999), resolving the 180° ambiguity that
exists for some 1,4 distances, such as χ2 for the p90 and p-90 Trp
rotamers.

As discussed in Headd et al. (2012), the 'top-out' potential is defined
by σ and limit parameters, with the latter parameterized in degrees to
control at what difference between related torsions the target is
smoothly reduced to zero. Our implementation is similar to the Welsch
robust estimator function (Dennis & Welsch, 1978), and is conceptually
similar to the local NCS potentials implemented in REFMAC5 (Murshudov
et al., 2011) and BUSTER (Smart et al., 2012). The target for each set
of NCS-related torsions is defined to be their average (except as noted
below), which is updated after each refinement step that moves
individual sites, including real-space refinement, reciprocal-space
refinement and Asn/Gln/His side-chain orientation correction. The
residuals for the torsion NCS restraints are calculated using the
following 'top-out' functional form:

$$E_{\mathrm{total}} = \sum_{i=1}^{n} E_i \qquad (1)$$

where Δ_i is the difference between the ith torsion and its NCS-related
average, σ is a user-definable standard deviation parameter, l is the
limit parameter and n is the total number of added reference
restraints. It should be noted that the average of two torsion angles
is calculated by taking the arctangent of the quotient of their average
sines and cosines.
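
The averaging rule and the shape of a 'top-out' term can be illustrated with a minimal sketch. The per-restraint form used here, E_i = τ[1 − exp(−Δ_i²/l²)] with τ = l²/σ², is an assumption made for illustration (a Welsch-like form that is harmonic for small Δ_i and levels off for large Δ_i) and should be checked against Headd et al. (2012). The function names and the values of σ and l in the usage note are hypothetical and are not taken from phenix.refine.

```python
import math

def circular_mean_deg(angles):
    """Average torsion angles (degrees) via the arctangent of mean sine over mean cosine."""
    s = sum(math.sin(math.radians(a)) for a in angles) / len(angles)
    c = sum(math.cos(math.radians(a)) for a in angles) / len(angles)
    return math.degrees(math.atan2(s, c))

def angle_diff_deg(a, b):
    """Signed difference a - b wrapped into the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def top_out_term(delta, sigma, limit):
    """Assumed per-restraint term: tau * (1 - exp(-delta^2 / limit^2)), tau = limit^2 / sigma^2."""
    tau = limit ** 2 / sigma ** 2
    return tau * (1.0 - math.exp(-(delta ** 2) / limit ** 2))

def top_out_total(torsions, sigma, limit):
    """Sum of top-out terms for one set of NCS-related torsions about their average."""
    target = circular_mean_deg(torsions)
    return sum(top_out_term(angle_diff_deg(t, target), sigma, limit) for t in torsions)
```

For example, `circular_mean_deg([170.0, -170.0])` lies at ±180° rather than at 0°, which is why a plain arithmetic mean is not used. With illustrative values σ = 2.5° and l = 15°, a term is close to Δ²/σ² for small Δ and saturates near l²/σ² = 36 for very different torsions, so strongly differing copies stop pulling on each other.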

Atomic displacement parameters (ADPs) for NCS-related atoms may be
restrained using the same parameterization as for global,
Cartesian-based NCS restraints, but we have found that allowing the
ADPs to be refined independently typically results in improved R
factors (data not shown), as NCS-related chains will often have
considerably different ADPs. This observation is consistent with
previous reports (Smart et al., 2012), and is also supported by the
variation in TLS (translation, libration, screw-rotation) models
observed for NCS-related chains in some cases (Burnley et al., 2012).

For this manuscript, all refinements in phenix.refine were carried out
using Phenix v.1.8.4-1496.

2.1. Knowledge-based rotamer correction for protein side 
chains 

Torsion-angle parameterization of NCS restraints allows the inclusion
of additional prior knowledge of protein geometry. To this end, we
identify the rotameric state of each protein side chain (standard amino
acids only), and only restrain matching χ angles that are in the same
rotameric state. For side chains with multiple χ angles (such as Lys),
we match successive χ angles out from the main chain as long as they
match, while not restraining any χ angles past the first angle that
differs. For example, if two matching Lys residues were in mmtt and
mmmt rotamer states (for a discussion of rotamer nomenclature, see
Lovell et al., 2000), respectively, only χ1 and χ2 would be restrained.
Parameterizing the restraints in this manner avoids restraining the χ4
angle, which may have a similar torsion angle but is in a different
rotamer state. Further, we do not allow rotamer outliers, Ramachandran
outliers (φ and ψ angles) or peptide outliers (ω angle >30° or <−30°)
to contribute to NCS target values. If such outliers are matched to one
or more NCS-related torsions with an allowed conformation, then the
target value is calculated from the set of allowed values, and the
outlier is restrained to this value. If during the course of refinement
these outliers are resolved to allowed conformations, their torsion
values will contribute to the NCS average target calculation during the
next macro-cycle (Afonine et al., 2012).
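
The two rules in this paragraph can be sketched as follows: restrain χ angles outward from the main chain only until the rotamer states first differ, and compute the NCS target from allowed (non-outlier) copies only. This is an illustrative sketch rather than the phenix.refine code; representing a rotamer as a tuple of per-χ state labels, the function names, and returning `None` when every copy is an outlier (a case the text does not specify) are all assumptions.

```python
import math

def restrained_chi_count(rotamer_a, rotamer_b):
    """Number of leading chi angles whose rotamer-state labels agree."""
    count = 0
    for state_a, state_b in zip(rotamer_a, rotamer_b):
        if state_a != state_b:
            break  # do not restrain any chi past the first difference
        count += 1
    return count

def ncs_target_deg(torsions, is_outlier):
    """Circular average (degrees) of the torsions from non-outlier copies only."""
    allowed = [t for t, bad in zip(torsions, is_outlier) if not bad]
    if not allowed:
        return None  # placeholder: behaviour for all-outlier sets is not given in the text
    s = sum(math.sin(math.radians(t)) for t in allowed)
    c = sum(math.cos(math.radians(t)) for t in allowed)
    return math.degrees(math.atan2(s, c))

# Lys example from the text: mmtt versus mmmt
n_restrained = restrained_chi_count(("m", "m", "t", "t"), ("m", "m", "m", "t"))
```

In the Lys example the count is 2, so only χ1 and χ2 are restrained; χ4 carries the same label in both copies but is left free because χ3 already differs.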

2.1.1. Rotamer correction. For proteins, knowledge of side-chain
rotamer distributions (Lovell et al., 2000) can be used to identify and
correct rotamer outliers at each macro-cycle. At high resolution
(roughly 1.8 Å or better), density shape alone is often sufficient to
correctly fix rotamer errors. As shown in Headd et al. (2009), however,
the ability to confidently accept such corrections drops off sharply
below 2.5 Å resolution. Using the knowledge of NCS, we can limit the
scope of rotamer searching to only attempt to fit rotamer states
observed in NCS-related copies of the same residue. By limiting the
search space, we are able to extend our ability to accept or reject
candidate corrections at lower resolution (as low as 3.0 Å by default)
through a priori knowledge of what the conformation should be. While
the scope of this search would miss the case where all side chains in
an NCS-related set should be an unrepresented rotamer, it also limits
errors that could arise by introducing new rotamer possibilities that
are not supported by NCS information. The side-chain fitting protocol
is similar in concept to the 'Autofix' routine presented in Headd et
al. (2009), but is limited to a 6° 'backrub' search (Davis et al.,
2006) followed by progressive 10° χ-angle sampling. The backbone
conformation must be sampled to correct any backbone distortions
brought about by the incorrect starting orientation. The 'backrub' is a
rotation of all atoms between two flanking Cα atoms as a rigid body.
Such motions have been shown to be necessary in conformational sampling
in protein design (Georgiev et al., 2008; Keedy et al., 2012). The full
protocol is shown in Fig. 1. Corrections are rejected if any
significant steric clashes are introduced with the surrounding model,
and it is also required that at least one additional side-chain atom
has a real-space correlation coefficient greater than or equal to
1.0σ. These strict requirements provide greater confidence that
corrected rotamers are a reasonable steric fit for the model, as well
as an improved fit to the experimental data.
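
The restriction of the search space and the acceptance test can be sketched in a few lines. This is a hypothetical illustration only: the function names, the `"OUTLIER"` label and the reading of 'at least one additional side-chain atom' as an increase of at least one in the number of atoms meeting the fit criterion are assumptions, and no sampling or density calculation is performed here.

```python
def candidate_rotamers(ncs_rotamers, outlier_label="OUTLIER"):
    """Distinct non-outlier rotamer names seen in NCS-related copies, in first-seen order."""
    seen = []
    for name in ncs_rotamers:
        if name != outlier_label and name not in seen:
            seen.append(name)
    return seen

def accept_correction(introduces_clash, n_fit_before, n_fit_after):
    """Accept only if no significant clash is introduced and one more atom fits the map."""
    if introduces_clash:
        return False
    return n_fit_after >= n_fit_before + 1

# An outlier residue with NCS mates in the pt, pt and tt states is only
# tested against those two observed rotamers.
candidates = candidate_rotamers(["OUTLIER", "pt", "pt", "tt"])
```

Restricting candidates in this way means that a rotamer never observed in any NCS copy is never proposed, which is the trade-off described in the text.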

2.1.2. Rotamer consistency. In cases where all NCS-related 
side chains are rotameric, but not all in the same rotamer state, 
it is possible that one or more side chains are misfitted. To 
identify and correct such cases, each side chain is tested 
against all possible candidate rotamers, with the best fit being 
accepted in each case following the above protocol. 

To test the effectiveness of rotamer outlier and consistency
correction, we re-refined the 1.7 Å resolution triosephosphate
isomerase structure (PDB entry 1m0o; Symersky et al., 2003), which
consists of two NCS-related copies in the asymmetric unit. In the
deposited model, the IleA120 side chain has been incorrectly fitted as
a tt rotamer. As shown in Fig. 2(a), the tt conformation is not a good
fit to the 2mFo − DFc map (see Afonine et al., 2012 for map details)
and has significant steric clashes with neighboring residues. The
NCS-related IleB120 residue, however, is fitted as a pt rotamer and is
an excellent fit to both the density map and the local steric
environment (data not shown). The automated rotamer consistency method
first rotates the χ angles of IleA120 to match those of the pt rotamer
observed for IleB120 (Fig. 2b) and performs a local χ-angle and backrub
conformation search (see above and Fig. 2c) in order to determine
whether this is a likely conformation. After reciprocal-space
refinement, the correct pt rotamer proves to be an improved fit to the
density map and the local environment (Fig. 2d). Similarly, the AspA32
side chain, which begins in an outlier conformation, is corrected to
the m-20 rotamer conformation of AspB32 using the same local search
method (data not shown).



2.2. Human-readable output 

When refining against a target that includes an adaptive geometric
restraint, such as our flexible torsion NCS restraints, we note that it
is important to provide informative feedback to the user about how such
restraints were applied in a given refinement in order to quantify
their impact on the resultant model. To this end, we provide summaries
of all applied torsion NCS restraints in the comprehensive geometry
restraints (GEO) files (available both before and after refinement). We
also provide a residue-by-residue matching summary for matched
residues, as well as step-by-step updates on the number of active
torsion NCS restraints. Finally, we provide an NCS summary in the
output PDB header, which includes details of each NCS group, the number
of matched torsions in a given group and torsion-based r.m.s.d. values
for all torsions above and below the set limit value. We also include a
histogram of torsion-angle differences between the model and target for
those torsions above and below the limit.

Figure 1
Flow diagram of NCS-related automated rotamer correction. Steps
recoverable from the diagram: adjust the outlier side chain to match
all torsions for the candidate rotamer; sample 6° of 'backrub'
conformations in each direction (Davis et al., 2006).

Providing such information allows detailed explanation of methods in
structure publications where torsion NCS restraints are used. Through
the iotbx.cif routines (Gildea et al., 2011), phenix.refine is also
fully compliant with migration to mmCIF format (Westbrook & Fitzgerald,
2005), and will allow the propagation of explicit NCS restraint
information.



Table 1
AutoBuild results for PDB entry 1sar.

Protocol     Rwork   Rfree   Rfree − Rwork  Rotamer outliers (%)  Ramachandran favored (%)  Clashscore
No NCS       0.2170  0.2537  0.0367         1.22                  100                       1.39
Global NCS   0.2385  0.2703  0.0318         3.66                  100                       1.74
Torsion NCS  0.2174  0.2501  0.0327         0.61                  100                       1.74



2.3. Refinement at medium-high resolution: an in-depth 
example 

Because the torsion-based NCS restraints allow for local differences,
including rotamer differences between NCS-related side chains, these
restraints can be safely used even at high resolution without much risk
of negatively impacting the refinement. To test the efficacy of our
restraints in a real experimental context, we performed molecular
replacement (MR) in Phaser (McCoy et al., 2007) with the data for RNAse
S (PDB entry 1sar; Sevcik et al., 1991), which are nominally at 1.8 Å
resolution (>90% complete to 1.85 Å resolution), using the A chain from
a related 2.0 Å resolution RNAse S structure (PDB entry 1rsn; Sevcik et
al., 2005) as a search model. Crystallographic symmetry operators and
origin shifts were applied to the resultant MR solution using
phenix.find_alt_orig_sym_mate (Oeffner et al., 2012) to place the model
in the same frame of reference as the deposited 1sar model.

Figure 2
NCS rotamer consistency correction for IleA120 in PDB entry 1m0o at
1.7 Å resolution. (a) Starting orientation of IleA120 in 1m0o in the tt
rotamer. Bad steric clashes (>0.4 Å) are depicted in hot pink. (b) χ
angles adjusted to match the pt rotamer orientation of the NCS-related
IleB120 side chain. (c) Result of the local conformation search,
including backrub motion, shown in green. (d) Following a default run
of phenix.refine, the correct pt rotamer is fitted to the density map.
All maps are 2mFo − DFc maps contoured at 1.2σ. Images were generated
using KiNG (Chen et al., 2009).

Figure 3
Comparison of Arg63 of PDB entry 1sar following molecular replacement
plus AutoBuild with refinement in phenix.refine. (a) Arg63 from the A
chain. The final position is shown for 1sar (dark blue), refinement
with torsion NCS restraints (light blue), refinement without NCS
restraints (green) and refinement with global NCS restraints (hot
pink), with rotamer states indicated in matching colors. (b) Arg63 from
the B chain. The final position is shown for PDB entry 1sar (dark
blue), refinement with torsion NCS restraints (light blue), refinement
without NCS restraints (green) and refinement with global NCS
restraints (hot pink), with rotamer states indicated in matching
colors. All maps are 2mFo − DFc maps contoured at 1σ. Images were
generated using KiNG (Chen et al., 2009).

AutoBuild (Terwilliger, 2002; Terwilliger et al., 2008) model
rebuilding was then performed using three different protocols: no NCS
in model refinement, global (Cartesian-based) NCS as part of model
refinement and torsion NCS with rotamer correction and consistency
checks as part of model refinement. Ordered water picking was disabled,
and default settings were used otherwise. Using the traditional global
NCS target, both Arg63 side chains are distorted to outlier
conformations, while the flexible torsion-based NCS target allows these
residues to reach and maintain valid rotamer states automatically
(Fig. 3). Validation statistics are summarized in Table 1. Running
AutoBuild without NCS and with torsion NCS produces final models with
similar R factors, with the torsion NCS model having a slightly smaller
Rfree − Rwork gap (Brünger, 1992), consistent with a less overfitted
model. By comparison, the global NCS model has much higher R factors.
There is one fewer rotamer outlier in the torsion NCS refined model,
consistent with rotamer correction as part of the torsion NCS method.
Compared with the deposited structure, the full-atom r.m.s.d.s for each
AutoBuild model are 0.543, 0.508 and 0.625 Å for no NCS, torsion NCS
and global NCS, respectively. Full-atom r.m.s.d.s were calculated using
VMD (Humphrey et al., 1996). The clashscore is slightly elevated when
using either NCS implementation: 1.74 versus 1.39 when using no NCS
restraints. Visual inspection reveals that the difference is a single
clash between the CG atom of ArgA40 and the HA2 atom of GlyB61. These
atoms clash to some degree in all three refinements, but the refinement
with no NCS restraints produces a model in which the overlap is just
below the cutoff of 0.4 Å, resulting in the lower clashscore.
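
The Rfree − Rwork gap discussed above can be checked directly from the values in Table 1. The short sketch below does this arithmetic, assuming that the first two numeric columns of Table 1 are Rwork and Rfree; the dictionary and function names are illustrative.

```python
# Values read from Table 1, assuming the first two numeric columns are
# Rwork and Rfree (the column assignment is an interpretation).
TABLE_1 = {
    "No NCS": (0.2170, 0.2537),
    "Global NCS": (0.2385, 0.2703),
    "Torsion NCS": (0.2174, 0.2501),
}

def r_gap(r_work, r_free):
    """Rfree - Rwork, rounded to the four decimals used in the table."""
    return round(r_free - r_work, 4)

gaps = {name: r_gap(r_work, r_free) for name, (r_work, r_free) in TABLE_1.items()}

for name, gap in gaps.items():
    print(f"{name:12s} gap = {gap:.4f}")
```

Under this reading the computed gaps reproduce the third numeric column of the table, and the torsion NCS gap is smaller than the gap obtained with no NCS restraints, as stated in the text.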

The two NCS-related chains in the asymmetric unit exhibit
conformational differences supported by the density, particularly in
the loop region surrounding Arg63. As seen in Fig. 4(a), ArgA63 is best
fitted as an mtm180 rotamer, while ArgB63 is best fitted as a ptm-180
rotamer (Fig. 4b), which is consistent with the rotamers observed in
the deposited 1sar structure.

2.4. Testing torsion NCS restraints in a typical MR workflow 
at moderate resolution 

To test the effectiveness of torsion-based NCS restraints in a typical
molecular replacement-driven workflow, we randomly selected 56 protein
structures from the PDB between 2.0 and 3.0 Å resolution which have
between two and four NCS copies, no ligands and are between 100 and 300
amino acids in length. This data set is summarized in Supplementary
Table S1¹. We also required each structure to have a homologue with
sequence similarity of >90% but <100% for use for molecular
replacement. We required this high level of similarity to limit the
need for any manual rebuilding, allowing us to test the effectiveness
of torsion NCS restraints in a fully automated mode of operation within
Phenix. Once phased using molecular replacement in Phaser (McCoy et
al., 2007), MR solutions were placed in the same frame of reference as
the deposited PDB entry using phenix.find_alt_orig_sym_mate (Oeffner et
al., 2012), and were then processed with AutoBuild (Terwilliger, 2002;
Terwilliger et al., 2008) using the rebuild-in-place option with no NCS
in refinement and no placing of waters. Following AutoBuild, models
were refined using phenix.refine for ten macro-cycles, refining
individual sites and individual ADPs, and optimizing target weights for
xyz sites. Each refinement was repeated with no NCS, global NCS and
torsion NCS restraints. As shown in Fig. 5(a), the use of torsion NCS
restraints and rotamer correction generally results in the same or a
lower Rfree value when compared with using no NCS restraints. By
comparison, global NCS restraints often result in much larger values of
Rfree, often coupled with significant distortions of the model. In a
handful of cases the global NCS restraints result in a slightly lower
Rfree value, but visual examination of these models reveals no
significant structural differences between these models and those
refined with torsion NCS restraints. We chose to report residual Rfree
values, Rfree(NCS) − Rfree(no NCS), rather than absolute Rfree values
because at this early stage of refinement the trend in Rfree is more
revealing than its absolute magnitude. Refinements would need to be
completed, including building the handful of missing side chains and
placing any ordered solvent and/or ions, for comparison with published
Rfree values to be revealing.

¹ Supporting information has been deposited in the IUCr electronic
archive (Reference: RR5054).

To test the relative contribution of the torsion NCS term and rotamer
correction, we also ran these refinements with torsion NCS restraints
alone and rotamer correction alone. As shown in Fig. 5(b), rotamer
correction alone generally results in Rfree values greater than or
equal to the combined approach, with 18/56 cases (~32%) resulting in a
worse Rfree than using no NCS restraints at all. By comparison, using
torsion NCS restraints alone produces results that correlate more
closely with the combined approach, with only 4/56 cases (~7%)
resulting in an Rfree value worse than using no NCS at all. Using both
torsion NCS restraints and rotamer correction combined gives the most
consistent results across this data set.
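
The residual Rfree, Rfree(NCS) − Rfree(no NCS), and the fraction of cases that become worse can be computed as in the following sketch. The function names are illustrative, and the per-structure residuals used in the tests are toy values rather than data from this study; only the counts 18/56 and 4/56 are taken from the text.

```python
def residual_r_free(r_free_ncs, r_free_no_ncs):
    """Residual Rfree: negative values mean the NCS refinement gave a lower Rfree."""
    return r_free_ncs - r_free_no_ncs

def percent_worse(residuals):
    """Percentage of cases with a positive residual (Rfree worse than with no NCS)."""
    worse = sum(1 for r in residuals if r > 0.0)
    return 100.0 * worse / len(residuals)

# Percentages corresponding to the counts quoted in the text.
rotamer_only_worse = 100.0 * 18 / 56   # about 32%
torsion_only_worse = 100.0 * 4 / 56    # about 7%
```

Reporting the residual rather than the absolute Rfree keeps each comparison internal to one structure, so differences in data quality between the 56 test cases do not obscure the trend.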

These refinement results were also compared with refinements carried
out using REFMAC5 (Murshudov et al., 2011). We ran REFMAC5 both with
and without local NCS restraints to allow us to calculate internally
consistent Rfree(NCS) − Rfree(no NCS) values. As shown in Fig. 5(a),
refinement in REFMAC5 using local NCS restraints exhibits the same
trend of improvement in Rfree over refinement in REFMAC5 without local
NCS restraints as observed for the phenix.refine results. On average,
the addition of local NCS restraints in REFMAC5 changes Rfree by
−0.62%, while torsion NCS restraints with rotamer correction in
phenix.refine change Rfree by −0.47%. With the exception of a handful
of cases (PDB entries 1c03, 2fxk and 2o9f for torsion NCS with rotamer
correction, 2o9f for REFMAC5), both methods at worst produce the same
Rfree as refinement without NCS restraints. These results suggest that
both NCS parameterizations are a suitable automated strategy for the
moderate-resolution cases presented in this test set.

Figure 4
Comparison of the Arg63 loop in the A and B chains in the 1.8 Å
resolution RNAse S structure (PDB entry 1sar). (a) Overlay of the Arg63
loop region from the A chain (blue) and B chain (green), illustrating
the rotameric difference of Arg63 between the chains. (b) Arg63 from
the A chain with 2mFo − DFc density map. (c) Arg63 from chain B with
2mFo − DFc density map. All maps are contoured at 1σ. Images were
generated using KiNG (Chen et al., 2009).

Geometric validation metrics demonstrate similar results. As shown in
Fig. 6(a), the rotamer outlier percentage from refinements using
torsion NCS restraints with rotamer correction is similar to that from
refinements with no NCS restraints (averages of 1.41 and 1.43%,
respectively), with many cases of a higher rotamer outlier percentage
when using global NCS restraints (average of 2.62%) or REFMAC5 (average
of 2.64%). As shown in Fig. 6(b), torsion NCS restraints alone and
rotamer correction alone exhibit a similar trend to that of the
combined approach, with torsion NCS alone having an average outlier
percentage of 1.48% and rotamer correction alone having an average
outlier percentage of 1.39%.

Figure 5
(a) Plot of residual Rfree values for refinements of a set of 56
moderate-resolution structures using torsion NCS with rotamer
correction (blue diamonds), global NCS (red squares) and local NCS
restraints in REFMAC5 (Murshudov et al., 2011; green squares). The
residual Rfree is calculated as Rfree(NCS) − Rfree(no NCS). (b) Plot of
residual Rfree values for refinements using torsion NCS restraints only
(purple crosses), rotamer correction only (light blue dashed crosses)
and torsion NCS with rotamer correction (blue diamonds). Data for both
plots are plotted on the x axis in order of increasing residual Rfree
for refinement with torsion NCS with rotamer correction in
phenix.refine.

Ramachandran analysis produces similar results. As shown in Fig. 7(a),
refinement with no NCS, torsion NCS with rotamer correction or
refinement with REFMAC5 all produce similar percentages of
Ramachandran outliers, with averages of 0.46, 0.42 and 0.52%,
respectively. The results from the refinements using global NCS
restraints are slightly worse, with an average Ramachandran outlier
percentage of 0.59%. Fig. 7(b) illustrates that refinements with
torsion NCS alone produce models with slightly better average
Ramachandran outlier percentages than refinements with rotamer
correction alone (0.39 versus 0.47%) but, as for the rotamer outlier
percentages, the trend is similar.

Figure 6
(a) Rotamer outlier percentage analysis for a set of 56 test
refinements using torsion NCS with rotamer correction (blue diamonds),
global NCS (red triangles), no NCS (orange circles) and REFMAC5 (green
triangles). (b) Rotamer outlier percentage for torsion NCS restraints
only (purple crosses), rotamer correction only (light blue dashed
crosses) and torsion NCS with rotamer correction (blue diamonds). Both
plots are sorted by increasing rotamer outlier percentage using torsion
NCS restraints with rotamer correction.

Figure 7
Comparison of NCS-related LeuA88 and LeuB88 of PDB entry 1jbb refined
with and without torsion NCS restraints. (a) Refinement without NCS
restraints allows the incorrectly built LeuB88 side chain to remain a
tp rotamer while distorting the surrounding backbone geometry. (b)
Refinement using torsion NCS restraints preserves similar backbone
geometry, which causes the incorrectly built LeuB88 side chain to
present as an outlier. Images were generated using KiNG (Chen et al.,
2009).

As shown in Fig. 8(a), refinement using torsion NCS restraints with
rotamer correction produces slightly elevated clashscores compared with
refinement using no NCS restraints (averages of 3.10 and 2.97,
respectively), while refinement using global NCS restraints causes
elevated clashscores in many cases (average clashscore of 3.67).
Refinements with REFMAC5 produce the highest clashscores across the
test set, with an average of 4.39. Rotamer correction alone and torsion
NCS restraints alone produce similar results to the combined approach
(average clashscores of 3.01 and 3.05, respectively), with the slightly
better performance by rotamer correction alone likely owing to an
increased emphasis on not introducing steric clashes, coupled with
fewer restraints on the overall model. As described in Chen et al.
(2010), clashscore is defined as the number of steric clashes >0.4 Å
per 1000 atoms. These differences in clashscore, therefore, are
minimal, but serve to show that in general the use of torsion NCS
restraints results in a model approximately as good as, if not better
than, those models refined with no NCS restraints, and that such
restraints are usually safe to use at this working resolution range,
even very early in the refinement process.
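
The clashscore definition quoted above (Chen et al., 2010) amounts to a simple normalization, sketched below. The function names and the example overlaps are hypothetical; in practice the list of atomic overlaps would come from an all-atom contact analysis, which is not reproduced here.

```python
def count_clashes(overlaps, cutoff=0.4):
    """Number of atomic overlaps (in angstroms) strictly greater than the cutoff."""
    return sum(1 for overlap in overlaps if overlap > cutoff)

def clashscore(n_clashes, n_atoms):
    """Steric clashes per 1000 atoms."""
    return 1000.0 * n_clashes / n_atoms

# Hypothetical example: an overlap of 0.39 falls just below the cutoff
# and is not counted, whereas 0.41 and 0.55 are.
n_clashes = count_clashes([0.39, 0.41, 0.55])
score = clashscore(n_clashes, 2000)
```

Because the count is thresholded, a single contact moving from just above to just below 0.4 Å changes the clashscore by 1000 divided by the number of atoms, which is why small differences in this metric between refinements should not be over-interpreted.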

Interestingly, as shown in Fig. 5(b), of the three cases in which
torsion NCS with rotamer correction results in higher Rfree values than
with no NCS restraints at all, rotamer correction alone corrects this
problem in two cases (PDB entries 1c03 and 2o9f) and torsion NCS
restraints alone correct this problem in the other case (PDB entry
2fxk). Closer inspection of 1c03 reveals that the model has perfect
Ramachandran statistics for all refinements (Fig. 9), limiting the
benefit of torsion NCS restraints on the backbone. The refinement
without any NCS restraints produces the lowest rotamer outlier
percentage for this example, suggesting that rotamer correction is too
aggressive in this case and, combined with torsion NCS restraints,
produces a poorer model. For 2fxk, an improvement in rotamer outlier
percentage with rotamer correction comes at the cost of an increase in
the number of Ramachandran outliers, leading to an overfitted model,
explaining the overall improvement for the torsion NCS only refinement.
Finally, for 2o9f, all models fall into the bottom third of each
geometry validation metric (third from last in the Ramachandran outlier
percentage), suggesting that the refinement of models that are quite
far from the correct global minimum requires more concerted motions
than are possible with the addition of simple restraints, and that the
addition of too many restraints further limits the ability of the model
to move towards this minimum.

Figure 8
(a) Clashscore (Chen et al., 2010) analysis for a set of 56 test
refinements using torsion NCS with rotamer correction (blue diamonds),
global NCS (red triangles), no NCS (orange circles) and REFMAC5 (green
triangles). (b) Clashscore analysis for torsion NCS restraints only
(purple crosses), rotamer correction only (light blue dashed crosses)
and torsion NCS with rotamer correction (blue diamonds). Both plots are
sorted by increasing rotamer outlier percentage using torsion NCS
restraints with rotamer correction.

Figure 9
(a) Ramachandran outlier percentage analysis for a set of 56 test
refinements using torsion NCS with rotamer correction (blue diamonds),
global NCS (red triangles), no NCS (orange circles) and REFMAC5 (green
triangles). (b) Ramachandran outlier percentage for torsion NCS
restraints only (purple crosses), rotamer correction only (light blue
dashed crosses) and torsion NCS with rotamer correction (blue
diamonds). Both plots are sorted by increasing Ramachandran outlier
percentage using torsion NCS restraints with rotamer correction.

On occasion, the final model 
following refinement using 
torsion NCS restraints will have 
a slightly higher rotamer outlier 
percentage or clashscore than a 
comparable model refined using 
no NCS restraints. In our experi- 
ence, this almost always indicates 
an area of the model that requires 
concerted rebuilding beyond 
the capacity of current automated 
refinement methods. For 
example, from the 56 models 
selected for this test, the final 
torsion NCS-refined model for 
the 2.0 Å resolution ubiquitin-
conjugating enzyme structure
(PDB entry 1jbb; VanDemark et
al., 2001) has a rotamer outlier
percentage of 1.49% (a total of 
four outliers) versus a default- 
refined model outlier percentage 
of 1.12% (a total of three 
outliers), coupled with an 
elevated clashscore (2.85 versus 
2.65). The difference is an outlier 
for LeuB88 using torsion NCS 
restraints versus a tp rotamer 
when using no NCS restraints. 
While the LeuA88 side chain is an 
mt rotamer, the model around 
the side chain is too distorted 
for either the rotamer outlier or 
rotamer consistency routines to 
correct this change. As shown in 
Fig. 7, however, the use of torsion 
NCS restraints is able to refine 
to similar backbone orientations 
between the A and B chain, 
causing the incorrect side chain to 
stand out as an outlier. Using no 
NCS restraints, this side chain 
distributes the error across the 
local backbone, refining to a 
false-positive tp rotamer (Figs. 1a
and 10a). The clashscore is also 
eased by distributing the error 
across the backbone, explaining 
the higher observed clashscore 
with torsion NCS restraints. 
The φ/ψ values around LeuA88
(-138.5°, 140.3°) are quite
different from those around LeuB88 (-155.5°, 125.9°) when
refined with no NCS. Conversely, the φ/ψ values are quite
similar when refined using flexible torsion NCS restraints
[(-145.5°, 134.3°) and (-147.9°, 130.7°)]. Outliers such as
these can be corrected using more aggressive refinement 
methods or through simple rebuilding in a graphical building 
program such as Coot (Emsley et al., 2010). In this case, the 
side chain is corrected to an mt rotamer using Coot (Fig. 10b), 
and subsequent refinement confirms that this is a preferable 
rotamer for this side chain (Fig. 10c). 
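
The backbone comparison quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of phenix.refine: the helper names are hypothetical, and it assumes that the two torsion NCS value pairs are listed in the order chain A, then chain B. It computes the absolute φ and ψ differences between the two chains, wrapping angles so that the periodicity of torsions is respected.

```python
def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped to the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def chain_deviation(pair):
    """Absolute (phi, psi) differences between chains A and B, in degrees."""
    (phi_a, psi_a), (phi_b, psi_b) = pair["A"], pair["B"]
    return (round(abs(angle_diff(phi_a, phi_b)), 1),
            round(abs(angle_diff(psi_a, psi_b)), 1))


# (phi, psi) around Leu88 as quoted in the text; the A/B order of the
# torsion NCS pair is an assumption (the absolute differences do not depend on it).
no_ncs = {"A": (-138.5, 140.3), "B": (-155.5, 125.9)}
torsion_ncs = {"A": (-145.5, 134.3), "B": (-147.9, 130.7)}

print("no NCS:     ", chain_deviation(no_ncs))
print("torsion NCS:", chain_deviation(torsion_ncs))
```

With the quoted values, the chains differ by 17.0° in φ and 14.4° in ψ without NCS restraints, and by 2.4° and 3.6° with torsion NCS restraints.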

Following five additional macro-cycles of refinement, 
the torsion NCS-refined model with corrected LeuB88 has
improved Rwork/Rfree values (0.1942/0.2441) compared with the
model with corrected LeuB88 refined with no NCS restraints 
(0.2015/0.2503). The final rotamer outlier percentages favor 
the torsion NCS-refined model (1.12 versus 1.49%). 
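
As a consistency check on the outlier counts quoted for 1jbb (four outliers at 1.49%, three at 1.12%), the sketch below searches for the number of evaluated residues that reproduces both percentages. It assumes, as the text does not state explicitly, that both percentages share the same denominator and are rounded to two decimal places; the function name is hypothetical.

```python
def consistent_totals(counts_and_percents, max_total=2000, decimals=2):
    """Return every total N for which each count/N, expressed as a
    percentage and rounded, matches its quoted percentage."""
    totals = []
    for n in range(1, max_total + 1):
        if all(round(100.0 * k / n, decimals) == p
               for k, p in counts_and_percents):
            totals.append(n)
    return totals


# (number of rotamer outliers, quoted percentage) from the text.
quoted = [(4, 1.49), (3, 1.12)]
print(consistent_totals(quoted))
```

Under these assumptions only totals of 268 or 269 evaluated residues are compatible with both figures, so the two quoted percentages differ by exactly one outlier, LeuB88.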



2.5. Testing re-refinement at a wide range of resolutions 

To test the safety and efficacy of using torsion NCS 
restraints at a wide range of resolutions, we selected a set of 
deposited PDB structures ranging from 1.0 to 4.1 Å resolution
roughly in increments of 0.1 Å. Structures were chosen that
had between two and six NCS copies, usable structure factors
and Rfree sets, and no ligands. This data set is summarized in
Supplementary Table S2. Each structure was re-refined using
phenix.refine for five macro-cycles, refining individual sites
(with weight optimization) and individual ADPs. The results
are summarized in Supplementary Fig. S1.

The expectation in re-refining deposited models is that they
are already well refined, and it is unlikely that simple
refinement will greatly change the model or validation
statistics. As shown in Supplementary Fig. S1(a), Rfree values
generally improved slightly upon re-refinement (compared
with the value for the deposited model), with the exception
of significant increases when using global NCS restraints
in some cases (PDB entries 3d95, 4dov, 4ilj and 2vr9). The
average change in Rfree using no NCS parameterization is
−0.007, with the largest decrease in Rfree being −0.047 for
PDB entry 1xdv (4.1 Å) and the largest increase in Rfree being
+0.024 for PDB entry 4i6p (2.9 Å). By comparison, the use
of global NCS restraints often results in an increased Rfree
compared with no NCS restraints, with an average increase of
+0.016. The use of torsion NCS restraints alone resulted in
an average change in Rfree of −0.002, while the use of NCS-
related rotamer correction resulted in an average change of
−0.001. When combined, torsion NCS restraints with
rotamer correction result in an average change of −0.003.
These changes, while subtle, support the claim that
torsion NCS restraints, particularly in conjunction with NCS-
related rotamer correction, are safe to use across a wide
resolution range, resulting in models that are similar to or slightly
better than those derived from refinement with no NCS
restraints.

To further validate these results, we also looked at rotamer
outlier percentage (Supplementary Fig. S1b), Ramachandran
outlier percentage (Supplementary Fig. S1c) and clashscore
(Supplementary Fig. S1d). Compared with the PDB-deposited
models, refinement with phenix.refine with no NCS results in
an average change in the rotamer outlier percentage of
−2.34%, while refinement with global NCS results in an
average change of only −0.24%. Refinement with torsion
NCS alone, NCS-related rotamer correction alone and torsion
NCS combined with rotamer correction results in average
changes of −2.41, −2.54 and −2.54%, respectively, when
compared with the PDB-deposited models.

Ramachandran analysis of these results demonstrates that
refinement with no NCS results in an average change in
Ramachandran outlier percentage of −1.13% when compared
with the deposited models in the PDB, while refinement with
global NCS restraints results in an average change of −0.63%.
Refinement with torsion NCS alone, NCS-related rotamer
correction alone and torsion NCS combined with rotamer
correction results in average changes of −1.12, −1.15 and
−1.10%, respectively, when compared with the PDB-deposited
models.

Figure 10

Handling of LeuB88 in the refinement of PDB entry 1jbb (2.0 Å resolution). (a) Following ten macro-cycles of phenix.refine, LeuB88 refines to an
incorrect tp rotamer (no NCS restraints, shown in green) or an outlier (torsion NCS restraints, shown in pink). Simple rotation to a correct mt rotamer
(purple) does not improve the density fit sufficiently for acceptance by the rotamer correction routine. (b) Correct placement of the mt rotamer (blue)
following 'autofit best rotamer' using Coot (Emsley et al., 2010). (c) Following five additional macro-cycles of phenix.refine using torsion NCS restraints,
the correct mt rotamer remains and the positive mFo − DFc density peak is eliminated. 2mFo − DFc maps (gray mesh) are contoured at 1.2σ. mFo − DFc
maps (green peak) are contoured at 3.5σ. Images were generated using KiNG (Chen et al., 2009).

Clashscore analysis of these results demonstrates that
refinement with no NCS results in an average change in
clashscore of −12.44 when compared with the deposited PDB
models, while refinement with global NCS restraints results
in an average change of −9.63. Refinement with torsion NCS
alone, NCS-related rotamer correction alone and torsion NCS
combined with rotamer correction results in average changes in
clashscore of −11.98, −12.25 and −11.89, respectively, when
compared with the PDB-deposited models.

Overall, these validation results are consistent with our 
observation that refinement with torsion NCS restraints with 
rotamer correction produces models as good as or better than 
those produced by refinement with no NCS. Refinement with 
global NCS, on the other hand, is not as successful in 
improving the PDB-deposited models, producing models with 
worse validation statistics than refinement with no NCS 
restraints or with torsion NCS with rotamer correction. 
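
To make the comparison with the no-NCS baseline explicit, the averages quoted above can be tabulated and expressed relative to refinement with no NCS. This is a minimal sketch using only the numbers given in the text; the dictionary layout and function name are illustrative, and subtracting averages in this way assumes that every protocol was run on the same set of structures.

```python
# Average changes relative to the deposited models, as quoted in the text:
# (rotamer outlier %, Ramachandran outlier %, clashscore).
reported = {
    "no NCS": (-2.34, -1.13, -12.44),
    "global NCS": (-0.24, -0.63, -9.63),
    "torsion NCS": (-2.41, -1.12, -11.98),
    "rotamer correction": (-2.54, -1.15, -12.25),
    "torsion NCS + rotamer correction": (-2.54, -1.10, -11.89),
}
metrics = ("rotamer outliers (%)", "Ramachandran outliers (%)", "clashscore")


def relative_to_baseline(table, baseline="no NCS"):
    """Difference of each protocol's average change from the baseline's.
    Negative values mean a larger improvement than the baseline."""
    base = table[baseline]
    return {
        name: tuple(round(v - b, 2) for v, b in zip(vals, base))
        for name, vals in table.items()
        if name != baseline
    }


for name, deltas in relative_to_baseline(reported).items():
    print(f"{name:34s}", dict(zip(metrics, deltas)))
```

On these figures, torsion NCS with rotamer correction stays within 0.2 percentage points of the no-NCS result for rotamer and Ramachandran outliers and within 0.55 clashscore units, whereas global NCS is worse than the baseline by 2.10, 0.50 and 2.81, respectively.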

3. Discussion 

In this manuscript, we introduce flexible torsion-based NCS 
restraints for refinement of crystallographic structures of 
macromolecules. We select torsion space as a parameteriza- 
tion space for these restraints, as torsion angles have a strong 
correlation to well characterized structural features such as 
amino-acid side-chain rotamers and RNA backbone confor- 
mations. Further, using torsion angles allows a minimal set of 
restraints to fully describe the in-sequence NCS relationship. 
By comparison, methods that use interatomic distances to 
restrain local NCS relationships do so by capturing longer 
range spatial relationships between NCS-related copies, but 
are likely to require the addition of more restraints to achieve 
the same coverage of fold-space as achieved with our torsion- 
based approach. Further, distance-based restraints do not 
directly relate to torsion-based folding expectations, such as 
side-chain rotamers and backbone conformations in proteins. 
As a result, rotameric errors may be preserved or introduced 
by restraining local distances, which is supported by the higher 
rotamer outlier percentages observed in the refinements
carried out in this study using REFMAC5 with local NCS
restraints.

By using a flexible target function, we allow local differ- 
ences between NCS-related chains while maintaining a fully 
automated functionality. This parameterization avoids the 
need for any manual definition of NCS groups and is inde- 
pendent of the Euclidean relationship between related chains. 
Rotamer correction and consistency algorithms allow auto- 
mated correction of modeling errors in many cases, reducing 
the need for manual rebuilding and decreasing the necessary 
number of total refinement macro-cycles. Further, as shown in 
the example of Leu88 in PDB entry 1jbb, the use of torsion



NCS restraints at early stages of refinement can cause incor- 
rectly built side chains to remain as outliers even after several 
rounds of refinement, allowing identification and correction 
early in the refinement process. Clashscores, in particular, are 
observed to be elevated when using NCS restraints, especially
in cases where there is still a significant percentage of
uncorrected rotamer outliers, where the increased rigidity 
introduced through torsion restraints prevents these errors 
from being distributed across the backbone. The positive 
aspect of concentrating these errors is that it simplifies the 
identification of problem areas, directing the crystallographer 
to the areas of a model that need the most manual rebuilding. 

One outstanding limitation for both torsion-based and 
global, Cartesian-based, NCS approaches is that in order for 
true differences between NCS-related copies to be properly 
refined, differences must be introduced in the model prior to
refinement. If NCS-related models are exactly the same, then 
the NCS contribution to the geometry target will be zero, 
which is a minimum that simple refinement is unlikely to 
escape. This situation commonly arises in the case of mole- 
cular replacement where the same search model is used to fit 
all chains. Some method of randomization, whether it be 
minimization, simulated annealing or automated model 
building (such as AutoBuild), typically needs to be employed
as part of the initial modeling process for NCS restraints to 
be used and still allow conformational differences between 
related chains. Thus, we recommend that models containing 
NCS-related chains which are solved by molecular replace- 
ment at sufficiently high resolution be rebuilt with AutoBuild 
with NCS refinement disabled to allow initial differences to be 
introduced prior to application of an appropriate refinement 
strategy for the working resolution. In the future, methods 
that combine refinement and local rebuilding could provide an 
alternative approach to automatically breaking NCS. 
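
The limitation described above can be illustrated with a simple flexible restraint term. The sketch below uses an assumed 'top-out' functional form that is harmonic for small torsion differences and plateaus for large ones; it is meant only as an illustration of the behavior, not as the target implemented in phenix.refine, and the `sigma` and `limit` values are hypothetical.

```python
import math


def top_out(delta_deg, sigma=2.5, limit=15.0):
    """Illustrative flexible restraint on a torsion difference (degrees).

    Approximates (delta/sigma)**2 for small differences and plateaus at
    (limit/sigma)**2 for large ones. Returns (residual, gradient), where
    gradient is the derivative of the residual with respect to delta.
    sigma and limit are hypothetical example values.
    """
    damping = math.exp(-(delta_deg ** 2) / (limit ** 2))
    residual = (limit / sigma) ** 2 * (1.0 - damping)
    gradient = (2.0 * delta_deg / sigma ** 2) * damping
    return residual, gradient


# Identical copies (delta = 0), a small difference, and a genuine difference.
for delta in (0.0, 2.0, 120.0):
    residual, gradient = top_out(delta)
    print(f"delta={delta:6.1f}  residual={residual:8.4f}  gradient={gradient:8.4f}")
```

For identical copies both the residual and its gradient are exactly zero, so the restraint provides no force that could drive the chains apart. For a large, genuine difference the term plateaus and its gradient vanishes again, which is how a flexible target can tolerate local differences once they are present in the model.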